HDBTracker: Monitoring the Aggregates On Dynamic Hidden Web Databases

Authors

  • Weimo Liu
  • Saad Bin Suhaim
  • Saravanan Thirumuruganathan
  • Nan Zhang
  • Gautam Das
  • Ali Jaoua
Abstract

Numerous web databases, e.g., amazon.com and eBay.com, are “hidden” behind (i.e., accessible only through) their restrictive search and browsing interfaces. This demonstration showcases HDBTracker, a web-based system that reveals and tracks (the changes of) user-specified aggregate queries over such hidden web databases, especially those that are frequently updated, by issuing a small number of search queries through the public web interfaces of these databases. The ability to track and monitor aggregates has applications across a wide variety of domains. For example, government agencies can track the COUNT of openings at online job-hunting websites to understand key economic indicators, while businesses can track the AVG price of a product over a basket of e-commerce websites to understand the competitive landscape and/or material costs. A key technique used in HDBTracker is RS-ESTIMATOR, the first algorithm that can efficiently monitor changes to aggregate query answers over a hidden web database.

Similar resources

Discover Aggregates Exceptions over Hidden Web Databases

Nowadays, many web databases “hidden” behind their restrictive search interfaces (e.g., Amazon, eBay) contain rich and valuable information that is of significant interest to various third parties. Recent studies have demonstrated the possibility of estimating/tracking certain aggregate queries over dynamic hidden web databases. Nonetheless, tracking all possible aggregate query answers to rep...

Aggregates Disclosure in Hidden Web Databases: an Urgent Challenge

Hidden web databases are widely prevalent on the Internet. Security issues specific to hidden databases, however, have been largely overlooked by the research community, possibly due to the (false) sense of security provided by the restrictive access (i.e., web interface) to such databases. We argue that an urgent challenge facing today’s hidden databases is the disclosure of sensitive aggregat...

Aggregate Estimation Over Dynamic Hidden Web Databases

Many databases on the web are “hidden” behind (i.e., accessible only through) their restrictive, form-like search interfaces. Recent studies have shown that it is possible to estimate aggregate query answers over such hidden web databases by issuing a small number of carefully designed search queries through the restrictive web interface. A problem with this existing work, however, is that th...

DPro: A Probabilistic Approach for Hidden Web Database Selection Using Dynamic Probing

An ever increasing amount of valuable information is stored in Web databases, “hidden” behind search interfaces. To save the user’s effort in manually exploring each database, metasearchers automatically select the most relevant databases to a user’s query [2, 5, 16, 21, 26]. Existing methods use a pre-collected summary of each database to estimate its “relevancy” to the query, and return the d...

MetaQuerier over the Deep Web: Shallow Integration across Holistic Sources

The Web has been rapidly “deepened” by myriad searchable databases online. To enable effective access to the “deep Web,” we are building the MetaQuerier for exploring and integrating databases on the Web. Such metaquerying must tackle integration at a large scale (as sources are proliferating online) and of a dynamic nature (as each query will access different sources). Toward such integration...

Journal:
  • PVLDB

Volume 7

Publication date: 2014